Online Learning and Optimization of Markov Jump Affine Models
Abstract
The problem of online learning and optimization of unknown Markov jump affine models is considered. An online learning policy, referred to as Markovian simultaneous perturbations stochastic approximation (MSPSA), is proposed for two optimization objectives: (i) quadratic cost minimization in the regulation problem and (ii) revenue (profit) maximization. The regret of MSPSA is shown to grow on the order of the square root of the learning horizon. Furthermore, using the van Trees inequality, it is shown that the regret of any policy grows no slower than that of MSPSA, which makes MSPSA an order-optimal learning policy. The MSPSA policy is also shown to converge to the optimal control input almost surely as well as in the mean-square sense. Simulation results illustrate the regret growth rate of MSPSA and show that MSPSA can offer significant gains over the greedy certainty equivalent approach.
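As a rough illustration of the simultaneous-perturbation idea that MSPSA builds on, the following is a minimal sketch of generic SPSA on a fixed quadratic cost. It is not the authors' MSPSA policy: the Markov jump structure, the regret analysis, and the revenue objective are not modeled, and the function names, gain constants, and target values are all assumptions chosen for illustration.

```python
import random


def spsa_minimize(cost, x0, n_iter=2000, a=0.2, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Generic SPSA sketch (hypothetical helper, not the paper's MSPSA).

    cost   : callable returning a scalar cost for a list of floats
    x0     : initial input
    a, A, alpha : step-size gain a_k = a / (k + 1 + A) ** alpha
    c, gamma    : perturbation size c_k = c / (k + 1) ** gamma
    """
    rng = random.Random(seed)
    x = list(x0)
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha
        c_k = c / (k + 1) ** gamma
        # Perturb all coordinates at once with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in x]
        x_plus = [xi + c_k * di for xi, di in zip(x, delta)]
        x_minus = [xi - c_k * di for xi, di in zip(x, delta)]
        # Two cost evaluations give a gradient estimate in every coordinate.
        diff = cost(x_plus) - cost(x_minus)
        x = [xi - a_k * diff / (2.0 * c_k * di) for xi, di in zip(x, delta)]
    return x


def quadratic_cost(x, target=(1.0, -2.0)):
    """Assumed toy objective: squared distance to a fixed target."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


if __name__ == "__main__":
    # The iterate should end up close to the assumed target (1.0, -2.0).
    print(spsa_minimize(quadratic_cost, [0.0, 0.0]))
```

The point of the sketch is that each iteration needs only two cost evaluations regardless of the input dimension, which is what makes this family of methods attractive when the model is unknown and only observed costs are available.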
Similar resources
Adaptive MCMC Methods for Inference on Discretely Observed Affine Jump Diffusion Models
In the present paper we generalize, in a Bayesian framework, the inferential solution proposed by Eraker, Johannes & Polson (2003) for stochastic volatility models with jumps and affine structure. We use an adaptive sampling methodology known as Delayed Rejection, suggested in Tierney & Mira (1999), in a Markov Chain Monte Carlo setting in order to reduce the asymptotic variance of the estima...
Application of Stochastic Optimal Control, Game Theory and Information Fusion for Cyber Defense Modelling
The present paper addresses an effective cyber defense model by applying information-fusion-based game-theoretical approaches. We aim to improve previous models by applying stochastic optimal control and robust optimization techniques. Jump processes are applied to model different and complex situations in cyber games. Applying jump processes, we propose some m...
A Higher Order Online Lyapunov-Based Emotional Learning for Rough-Neural Identifiers
To enhance the performance of rough-neural networks (R-NNs) in system identification, a new stable learning algorithm based on emotional learning is developed for them. This algorithm facilitates error convergence by increasing the memory depth of R-NNs. To this end, an emotional signal, formed as a linear combination of the identification error and its differences, is used to achie...
A New Fuzzy Stabilizer Based on Online Learning Algorithm for Damping of Low-Frequency Oscillations
A multi-objective Honey Bee Mating Optimization (HBMO) designed with an online learning mechanism is proposed in this paper to optimize the double Fuzzy-Lead-Lag (FLL) stabilizer parameters in order to improve the damping of low-frequency oscillations in a multi-machine power system. The proposed double FLL stabilizer consists of a low-pass filter and two fuzzy logic controllers whose parameters can be set by the ...
Fitting Jump Models
We describe a new framework for fitting jump models to a sequence of data. The key idea is to alternate between minimizing a loss function to fit multiple model parameters, and minimizing a discrete loss function to determine which set of model parameters is active at each data point. The framework is quite general and encompasses popular classes of models, such as hidden Markov models and piec...
Journal title:
- CoRR
Volume: abs/1605.02213
Publication date: 2016